1 A case for fractured mirrors

Ravishankar Ramamurthy, David J. DeWitt, Qi Su 

August 2003 The VLDB Journal — The International Journal on Very Large Data Bases, 

Volume 12 Issue 2 

Full text available: pdf(200.49 KB) Additional Information: full citation, abstract

Abstract. The decomposition storage model (DSM) vertically partitions all attributes of a table
and has excellent I/O behavior when the number of attributes accessed by a query is small. 
It also has a better cache footprint than the standard storage model (NSM) used by most 
database systems. However, DSM incurs a high cost in reconstructing the original tuple from 
its partitions. We first revisit some of the performance problems associated with DSM and 
suggest a simple indexing strategy and compa ... 



Keywords: Data placement, Disk mirroring, Vertical partitioning 



2 Run-time modeling and estimation of operating system power consumption

Tao Li, Lizy Kurian John 
2003 ACM SIGMETRICS international conference on Measurement and
modeling of computer systems, Volume 31 Issue 1

Full text available: pdf(233.33 KB) Additional Information: full citation, abstract, references, index terms

The increasing constraints on power consumption in many computing systems point to the 
need for power modeling and estimation for all components of a system. The Operating 
System (OS) constitutes a major software component and dissipates a significant portion of 
total power in many modern application executions. Therefore, modeling OS power is 
imperative for accurate software power evaluation, as well as power management (e.g. 
dynamic thermal control and equal energy scheduling) in the light of ... 

Keywords: low power, operating system, power estimation 

1 A modeling study of the TPC-C benchmark
Scott T. Leutenegger, Daniel Dias 

June 1993 ACM SIGMOD Record, Proceedings of the 1993 ACM SIGMOD international
conference on Management of data, Volume 22 Issue 2

Full text available: pdf(1.13 MB) Additional Information: full citation, abstract, references, citings, index terms



The TPC-C benchmark is a new benchmark approved by the TPC council intended for 
comparing database platforms running a medium complexity transaction processing 
workload. Some key aspects in which this new benchmark differs from the TPC-A benchmark 
are in having several transaction types, some of which are more complex than that in
TPC-A, and in having data access skew. In this paper we present results from a modelling study
of the TPC-C benchmark for both single node and distributed databas ... 

2 Dynamic page placement to improve locality in CC-NUMA multiprocessors for TPC-C
Kenneth M. Wilson, Bob B. Aglietti 

November 2001 Proceedings of the 2001 ACM/IEEE conference on Supercomputing 
(CDROM) 

Full text available: pdf(828.19 KB) Additional Information: full citation, abstract, references

The use of CC-NUMA multiprocessors complicates the placement of physical memory pages. 
Memory closest to a processor provides the best access time, but optimal memory page 
placement is a difficult problem with process movement, multiple processes requiring access 
to the same physical memory page, and application behavior changing over execution time. 
We use dynamic page placement to move memory pages where needed for the database 
benchmark TPC-C executing on a four node CC-NUMA multiprocessor. D ... 

Keywords: CC-NUMA, TPC-C, dynamic page placement, migration, multiprocessor, 
replication 



3 Order-of-magnitude advantage on TPC-C through massive parallelism 
Charles Levine 

May 1995 ACM SIGMOD Record, Proceedings of the 1995 ACM SIGMOD international
conference on Management of data, Volume 24 Issue 2

Full text available: pdf(169.02 KB) Additional Information: full citation, abstract, index terms

TPC Benchmark™ C (TPC-C) is the modern standard for measuring OLTP performance. 
Running TPC-C, Tandem demonstrated a massively parallel configuration of 112 CPUs which 
achieved ten times higher performance than any other system previously measured (and
today is still better by a factor of five). This result qualifies as the largest industry-standard
benchmark ever run. This paper briefly describes how the benchmark was configured and the
results which were obtained. 

4 I/O reference behavior of production database workloads and the TPC benchmarks:
an analysis at the logical level

Windsor W. Hsu, Alan Jay Smith, Honesty C. Young 

March 2001 ACM Transactions on Database Systems (TODS), Volume 26 Issue 1

Full text available: pdf(5.42 MB) Additional Information: full citation, abstract, references, index terms

As improvements in processor performance continue to far outpace improvements in storage 
performance, I/O is increasingly the bottleneck in computer systems, especially in large 
database systems that manage huge amounts of data. The key to achieving good I/O
performance is to thoroughly understand its characteristics. In this article we present a
comprehensive analysis of the logical I/O reference behavior of the peak production database
workloads from ten of the world's largest corporatio ... 

Keywords: I/O, TPC benchmarks, caching, locality, prefetching, production database
workloads, reference behavior, sequentiality, workload characterization



5 Goal-oriented buffer management revisited

Kurt P. Brown, Michael J. Carey, Miron Livny 

June 1996 ACM SIGMOD Record, Proceedings of the 1996 ACM SIGMOD international
conference on Management of data, Volume 25 Issue 2

Full text available: pdf(1.56 MB) Additional Information: full citation, abstract, references, citings, index terms

In this paper we revisit the problem of achieving multi-class workload response time goals 
by automatically adjusting the buffer memory allocations of each workload class. We discuss 
the virtues and limitations of previous work with respect to a set of criteria we lay out for 
judging the success of any goal-oriented resource allocation algorithm. We then introduce 
the concept of hit rate concavity and develop a new goal-oriented buffer allocation 
algorithm, called Class Fencing, th ... 

6 Database buffer size investigation for OLTP workloads

Thin-Fong Tsuei, Allan N. Packer, Keng-Tai Ko 

June 1997 ACM SIGMOD Record, Proceedings of the 1997 ACM SIGMOD international
conference on Management of data, Volume 26 Issue 2

Full text available: pdf(1.35 MB) Additional Information: full citation, abstract, references, citings, index terms

It is generally accepted that On-Line Transaction Processing (OLTP) systems benefit from 
large database memory buffers. As enterprise database systems become larger and more 
complex, hardware vendors are building increasingly large systems capable of supporting 
huge memory configurations. Database vendors in turn are developing buffer schemes to 
exploit this physical memory. How much will these developments benefit OLTP workloads? 
Through empirical studies on databases sized comp ... 

7 Improving cache performance with balanced tag and data paths
Jih-Kwon Peir, Windsor W. Hsu, Honesty Young, Shauchi Ong 

September 1996 Proceedings of the seventh international conference on Architectural
support for programming languages and operating systems, Volume 31, 30
Issue 9, 5

Full text available: pdf(1.07 MB) Additional Information: full citation, abstract, references, citings, index terms

There are two concurrent paths in a typical cache access: one through the data array and
the other through the tag array. The path through the data array drives the selected set out 
of the array. The path through the tag array determines cache hit/miss and, for set- 
associative caches, selects the appropriate line from within the selected set. In both direct- 
mapped and set-associative caches, the path through the tag array is significantly longer 
than that through the data array. In this paper ... 

8 Performance characterization of a Quad Pentium Pro SMP using OLTP workloads
Kimberly Keeton, David A. Patterson, Yong Qiang He, Roger C. Raphael, Walter E. Baker 

April 1998 ACM SIGARCH Computer Architecture News, Proceedings of the 25th
annual international symposium on Computer architecture, Volume 26 Issue 3

Full text available: pdf(1.58 MB), Publisher Site Additional Information: full citation, abstract, references, citings, index terms

Commercial applications are an important, yet often overlooked, workload with significantly 
different characteristics from technical workloads. The potential impact of these differences 
is that computers optimized for technical workloads may not provide good performance for 
commercial applications, and these applications may not fully exploit advances in processor 
design. To evaluate these issues, we use hardware counters to measure architectural 
features of a four-processor Pentium Pro-based se ... 

9 Capturing dynamic memory reference behavior with adaptive cache topology
Jih-Kwon Peir, Yongjoon Lee, Windsor W. Hsu 

October 1998 Proceedings of the eighth international conference on Architectural
support for programming languages and operating systems, Volume 33, 32
Issue 11, 5

Full text available: pdf(1.50 MB) Additional Information: full citation, abstract, references, citings, index terms

Memory references exhibit locality and are therefore not uniformly distributed across the 
sets of a cache. This skew reduces the effectiveness of a cache because it results in the 
caching of a considerable number of less-recently-used lines which are less likely to be re- 
referenced before they are replaced. In this paper, we describe a technique that dynamically 
identifies these less-recently-used lines and effectively utilizes the cache frames they occupy 
to more accurately approximate the glob ... 

10 Experiences with VI communication for database storage
Yuanyuan Zhou, Angelos Bilas, Suresh Jagannathan, Cezary Dubnicki, James F. Philbin, Kai Li 

May 2002 ACM SIGARCH Computer Architecture News, Volume 30 Issue 2

Full text available: pdf(1.29 MB), Publisher Site Additional Information: full citation, abstract, references, citings, index terms

This paper examines how VI-based interconnects can be used to improve I/O path
performance between a database server and the storage subsystem. We design and 
implement a software layer, DSA, that is layered between the application and VI. DSA takes 
advantage of specific VI features and deals with many of its shortcomings. We provide and 
evaluate one kernel-level and two user-level implementations of DSA. These 
implementations trade transparency and generality for performance at different degrees ... 

Keywords: Storage system, cluster-based storage, Database storage, storage area 
network, User-level Communication, Virtual Interface Architecture, processor overhead 

11 A permutation-based page interleaving scheme to reduce row-buffer conflicts and
exploit data locality

Zhao Zhang, Zhichun Zhu, Xiaodong Zhang 

December 2000 Proceedings of the 33rd annual ACM/IEEE international symposium on
Microarchitecture

Full text available: pdf(153.06 KB), ps(856.21 KB) Additional Information: full citation, references, citings, index terms



12 A methodology for auto-recognizing DBMS workloads 
Said S. Elnaffar 

September 2002 Proceedings of the 2002 conference of the Centre for Advanced 
Studies on Collaborative research 

Full text available: pdf(332.94 KB) Additional Information: full citation, abstract, references, index terms

The type of the workload on a database management system (DBMS) is a key consideration 
in tuning the system. Allocations for resources such as main memory can be very different 
depending on whether the workload type is Online Transaction Processing (OLTP) or 
Decision Support System (DSS). A DBMS also typically experiences changes in the type of 
workload it handles during its normal processing cycle. Database administrators must, 
therefore, recognize the significant shifts of workload type that d ... 



13 Backtrack programming in welded girder design

Albert D. M. Lewis 

July 1968 Proceedings of the fifth annual 1968 design automation workshop on 
Design automation 

Full text available: pdf(527.84 KB) Additional Information: full citation, abstract, references, citings, index terms

The object of engineering design is to satisfy some need of man with the maximization or 
minimization of some measure of effectiveness of the solution. Common measures of 
effectiveness are cost, cost-benefit ratio, and profit. In mathematical terminology an object 
or facility can be described by a list or vector of parameter values. The position of each 
element in the vector associates it with a particular parameter. The performance of the 
object or facility and the constraints imposed on t ... 



14 Modeling methodology: Facilitating level three cache studies using set sampling
Niki C. Thornock, J. Kelly Flanagan 

December 2000 Proceedings of the 32nd conference on Winter simulation 

Full text available: pdf(103.33 KB) Additional Information: full citation, abstract, references

We discuss some of the difficulties present in trace collection and trace-driven cache 
simulation. We then describe our multiprocessor tracing technique and verify that it 
accurately collects long traces. We propose sampling as a method to reduce required disk 
space, enable simulations to run faster, and effectively enlarge the trace buffer of our 
hardware monitor, decreasing trace distortion. To this end, we investigate time sampling 
and two types of set sampling. We conclude that the second se ... 



15 New TPC benchmarks for decision support and web commerce 
Meikel Poess, Chris Floyd 

December 2000 ACM SIGMOD Record, Volume 29 issue 4 

Full text available: pdf(686.16 KB) Additional Information: full citation, abstract, citings, index terms

For as long as there have been DBMS's and applications that use them, there has been 
interest in the performance characteristics that these systems exhibit. This month's column 
describes some of the recent work that has taken place in TPC, the Transaction Processing 
Performance Council. TPC-A and TPC-B are obsolete benchmarks that you might have heard
about in the past. TPC-C V3.5 is the current benchmark for OLTP systems. Introduced in 
1992, it has been run on many hardware platforms and DBMS's. ... 

16 Configuring buffer pools in DB2 UDB
Xiaoyi Xu, Patrick Martin, Wendy Powley 

September 2002 Proceedings of the 2002 conference of the Centre for Advanced 
Studies on Collaborative research 

Full text available: pdf(96.74 KB) Additional Information: full citation, abstract, references, index terms

Database Management Systems (DBMSs) use a main memory area as a buffer to reduce the 
number of disk accesses performed by a transaction. DB2 Universal Database divides the 
buffer area into a number of independent buffer pools and each database object (table or 
index) is assigned to a specific buffer pool. The task of configuring the buffer pools, which
involves defining the mapping of database objects to buffer pools and setting a size for each of the
buffer pools, is crucial for achieving optimal perfor ...

17 Contrasting characteristics and cache performance of technical and multi-user
commercial workloads

Ann Marie Grizzaffi Maynard, Colette M. Donnelly, Bret R. Olszewski 

November 1994 Proceedings of the sixth international conference on Architectural
support for programming languages and operating systems, Volume 29, 28
Issue 11, 5

Full text available: pdf(1.35 MB) Additional Information: full citation, abstract, references, citings, index terms

Experience has shown that many widely used benchmarks are poor predictors of the 
performance of systems running commercial applications. Research into this anomaly has 
long been hampered by a lack of address traces from representative multi-user commercial 
workloads. This paper presents research, using traces of industry-standard commercial 
benchmarks, which examines the characteristic differences between technical and 
commercial workloads and illustrates how those differences affect cache ... 

Keywords: cache performance, commercial workloads, memory subsystems, operating 
system activity, technical applications 



18 Performance modeling study of a client/server system architecture
Ji Shen, Shahla Butler 

December 1994 Proceedings of the 26th conference on Winter simulation 

Full text available: pdf(639.14 KB) Additional Information: full citation, references, citings, index terms



19 An analytical model for buffer hit rate prediction
Yongli Xi, Patrick Martin, Wendy Powley 

November 2001 Proceedings of the 2001 conference of the Centre for Advanced Studies 
on Collaborative research 

Full text available: pdf(100.79 KB) Additional Information: full citation, abstract, references, index terms

Of the many tuning parameters available in a database management system (DBMS), one of 
the most crucial to performance is the buffer pool size. Choosing an appropriate size, 
however, can be a difficult task. In this paper we present an analytical modeling approach to 
predicting the buffer pool hit rate that can be used to simplify the process of buffer pool 
sizing. A Markov Chain model is used to estimate the hit rate for buffer pools in IBM's DB2 
Universal Database. We present and experimental ... 

20 Energy aware design: Optimizing pipelines for power and performance
Viji Srinivasan, David Brooks, Michael Gschwind, Pradip Bose, Victor Zyuban, Philip N.
Strenski, Philip G. Emma

November 2002 Proceedings of the 35th annual ACM/IEEE international symposium on
Microarchitecture

Full text available: pdf(1.24 MB) Additional Information: full citation, abstract, references, index terms

During the concept phase and definition of next generation high-end processors, power and
performance will need to be weighted appropriately to deliver competitive cost/performance.
It is not enough to adopt a CPI-centric view alone in early-stage definition studies. One of
the fundamental issues confronting the architect at this stage is the choice of pipeline depth
and target frequency. In this paper we present an optimization methodology that starts with
an analytical power-performance model ...

Results (page 1): CPW and TPC-C
http://portalbeta.acm.org/results.cfm?coll=ACM&dl=ACM&CFID=14749062&CFTOKEN-... 12/8/03

ABSTRACT

The TPC-C benchmark is a new benchmark approved by the TPC council intended for comparing 
database platforms running a medium complexity transaction processing workload. Some key aspects 
in which this new benchmark differs from the TPC-A benchmark are in having several transaction 
types, some of which are more complex than that in TPC-A, and in having data access skew. In this 
paper we present results from a modelling study of the TPC-C benchmark for both single node and 
distributed database management systems. We simulate the TPC-C workload to determine expected 
buffer miss rates assuming an LRU buffer management policy. These miss rates are then used as 
inputs to a throughput model. From these models we show the following: (i) we quantify the data
access skew as specified in the benchmark and show what fraction of the accesses go to what
fraction of the data; (ii) we quantify the resulting buffer hit ratios for each relation as a function of
buffer size; (iii) we show that close to linear scale-up (about 3% from the ideal) can be achieved in a
distributed system, assuming replication of a read-only table; (iv) we examine the effect of packing
hot tuples into pages and show that a significant price/performance benefit can thus be achieved; and
(v) by coupling the buffer simulations with the throughput model, we examine typical
disk/memory configurations that maximize the overall price/performance.
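
The buffer simulation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulator. It assumes a single relation whose pages are referenced with a skew generated by a function in the style of the TPC-C NURand generator, and it replays that trace against an LRU buffer. The function names, the constant A = 255, the page count, and the trace length are all hypothetical choices made for illustration.

```python
import random
from collections import OrderedDict


def nurand(rng, a, x, y, c=0):
    """Skewed random integer in [x, y], in the style of TPC-C's NURand.

    The constants a and c are illustrative assumptions here.
    """
    return (((rng.randint(0, a) | rng.randint(x, y)) + c) % (y - x + 1)) + x


def make_trace(n_refs, n_pages, a=255, seed=1):
    """Generate a skewed page-reference trace over page ids 0..n_pages-1."""
    rng = random.Random(seed)
    return [nurand(rng, a, 0, n_pages - 1) for _ in range(n_refs)]


def lru_miss_rate(trace, buffer_pages):
    """Replay the trace against an LRU buffer and return the miss fraction."""
    buf = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for page in trace:
        if page in buf:
            buf.move_to_end(page)  # hit: mark as most recently used
        else:
            misses += 1
            buf[page] = True
            if len(buf) > buffer_pages:
                buf.popitem(last=False)  # evict the least recently used page
    return misses / len(trace)


def access_share(trace, hot_fraction):
    """Share of references going to the hottest fraction of touched pages."""
    counts = {}
    for page in trace:
        counts[page] = counts.get(page, 0) + 1
    ranked = sorted(counts.values(), reverse=True)
    hot = max(1, int(len(ranked) * hot_fraction))
    return sum(ranked[:hot]) / len(trace)


if __name__ == "__main__":
    trace = make_trace(n_refs=20000, n_pages=1000)
    print("share of accesses to hottest 20% of pages:", access_share(trace, 0.2))
    for size in (10, 100, 1000):
        print("buffer pages:", size, "miss rate:", lru_miss_rate(trace, size))
```

Because LRU is a stack algorithm, the miss rate on a fixed trace never increases as the buffer grows, so sweeping `buffer_pages` gives a miss-rate curve of the kind the abstract feeds into its throughput model.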